Abstract

Background, Aims and Scope. In 1995, the Center for Transportation Research (CTR) of Argonne National Laboratory (ANL) began to develop a model, called GREET (Greenhouse gases, Regulated Emissions, and Energy use in Transportation), for estimating the full fuel-cycle energy and emissions impacts of alternative transportation fuels and advanced vehicle technologies. The parametric assumptions used in the GREET model involve uncertainties. A new stochastic simulation tool, developed by Vishwamitra Research Institute (VRI), is built into the GREET model to address these uncertainties. This paper presents the methodology and features of this new stochastic simulation tool and evaluates the performance of the sampling techniques in the tool.

Methods. The new tool is operated through a graphical user interface (GUI) to perform the stochastic simulation. In general, five steps need to be followed to run a complete simulation: 1) Specify probability distribution functions; 2) Indicate the number of samples and the sampling technique; 3) Define the forecast variables; 4) Delete distribution functions (if necessary); and 5) Propagate the uncertainties and statistically analyze the outputs. The GREET model contains more than 700 default distribution functions for a wide variety of key parameters and as many as 3,000 forecast variables. The stochastic simulation tool incorporates 11 probability distribution function types for representing uncertain parameters and four sampling techniques (Monte Carlo sampling [MCS], Hammersley Sequence sampling [HSS], Latin Hypercube sampling [LHS] and Latin Hypercube Hammersley sampling [LHHS]) for stochastic simulation. To evaluate the performance of the four sampling techniques, 16 independent stochastic simulation runs were conducted in GREET and the output results were analyzed and compared.

Results and Discussion. With the same number of samples, the output distribution curve simulated by HSS is the smoothest, corresponding to the highest level of uniformity. To achieve the same level of smoothness as HSS with 1,000 samples, LHHS needs to be simulated with ~1,500 samples and LHS and MCS with ~3,000 samples. As a result, HSS can achieve more than 200% reduction in running time compared to LHS or MCS without compromising the accuracy and quality of the prediction curves. The simulated mean values are close to the actual mean value (within ±1%) regardless of the sampling technique and the number of samples (between 1,000 and 4,000). The standard deviation values are also close to one another (within ±5%). Increasing the number of samples brings the simulated mean value marginally closer to the actual mean value; however, the improvement is negligible. The simulation time is positively correlated with the number of samples; therefore, the trade-off between extending simulation time and improving the smoothness of the output distribution curve needs to be carefully assessed.

Conclusion. A new stochastic simulation tool has been developed and built into Argonne's GREET model to enhance its capability for addressing uncertainty. This new tool guides the user through each step of the process with user-friendly GUI windows. According to the performance comparison among the four sampling techniques, HSS was found to be the most efficient technique. Therefore, HSS was set as the default technique in GREET.


Keywords: Distribution function; GREET model; sampling technique; stochastic simulation; uncertainty; well-to-wheels analysis


Introduction 

Uncertainties are inherent in life, and we have learnt to deal with them by evolving cognitive heuristics and developing strategies. However, as a system becomes more complicated, involving more decisions and much higher stakes, heuristics become inadequate and mathematical models are required (Subramanyan and Diwekar 2005a). In situations involving uncertainties, a deterministic approach may produce results that do not truly reflect reality; in such cases, stochastic simulations incorporating the uncertainties need to be performed (Diwekar 2003a).

In 1995, with funding from the U.S. Department of Energy (DOE), the Center for Transportation Research (CTR) of Argonne National Laboratory (ANL) began to develop a model for estimating the full fuel-cycle energy and emissions impacts of alternative transportation fuels and advanced vehicle technologies (Wang 1996). The intent was to provide an analytical tool that allows researchers to readily analyze the parametric assumptions that affect fuel-cycle energy use and emissions associated with various fuels and vehicle technologies. The model, called GREET (Greenhouse gases, Regulated Emissions, and Energy use in Transportation), calculates fuel-cycle (often called well-to-wheels [WTW]) energy use in Btu/mi and emissions in g/mi for various transportation fuels and vehicle technologies. For energy use, GREET includes total energy use (all energy sources), fossil energy use (petroleum, natural gas [NG], and coal), and petroleum use. For emissions, the model includes three major greenhouse gases (GHGs) (carbon dioxide [CO2], methane [CH4], and nitrous oxide [N2O]) and five criteria pollutants (volatile organic compounds [VOC], carbon monoxide [CO], nitrogen oxides [NOx], particulate matter with a diameter of 10 micrometers or less [PM10], and sulfur oxides [SOx]). Since the release of the first version of GREET (GREET1.0) in 1996, Argonne has continued to update and upgrade the model. In November 2005, the most recent version, GREET1.7, was released. The new version reflects many new efforts conducted by Argonne during the last several years, including many new features, new fuel/vehicle pathways, and up-to-date information regarding energy use and emissions for fuel production activities and vehicle operations (Wang et al. 2005). Fig. 1 shows the GREET WTW modeling boundary. The GREET model is in the public domain, and any party can use it free of charge. The model and its associated documents are posted at Argonne's GREET website: http://www.transportation.anl.gov/software/GREET/index.html.

The GREET model incorporates a large number of input parameters and a wide variety of output results. Many of the input parameter assumptions involve uncertainties, which require probability distributions to represent how likely the parameter is to take each value over the range that defines the uncertainty (General Motors Corporation et al. 2001, Wang 2002, Brinkman et al. 2005, Wu et al. 2006). Since the parameters in GREET are uncertain, the resulting output variables consequently have to be represented by distributions as well. To address these uncertainties, a new stochastic simulation tool, developed by Vishwamitra Research Institute (VRI), is built into the GREET model. This tool has been built as a Microsoft® Excel add-in file with Visual Basic macros, which can be loaded whenever the user needs to perform a stochastic simulation within the model. The tool automates much of the process of setting up a stochastic simulation and guides the user through each step with user-friendly graphical user interface (GUI) windows. It incorporates four sampling techniques, including the new and efficient Hammersley Sequence Sampling (HSS) and Leaped HSS (a variant of HSS) (Diwekar 2003), and a built-in bank of 11 probability distribution function types for representing uncertain parameters. The following sections of this paper present the methodology and features of this new stochastic simulation tool, and compare the performance of the four sampling techniques for selected GREET output variables.

1 Methodology and Features of the Stochastic Simulation Tool

The tool is operated through a command bar with five buttons, as shown in Fig. 2, which together perform a complete stochastic simulation with the following functions:

a) Cell Input: Specify probability distribution functions for the input variables;

b) Sampling: Indicate the number of samples required and the sampling technique to be used;

c) Forecast Cells: Define the forecast variables;

d) Delete Distribution: Delete a previously defined distribution;

e) Run Simulation: Propagate the uncertainties and statistically analyze the outputs.


Fig. 1: GREET well-to-wheels modeling boundary for fuel/vehicle systems

Fig. 2: Stochastic simulation command bar (buttons: Cell Input, Sampling, Forecast Cells, Run Simulation, Delete Distribution)

1.1 Cell Input

The first button, 'Cell Input', is for specifying the input probability distribution of each uncertain variable. The user selects one of the parametric assumption cells for which a probability distribution is to be specified and clicks on 'Cell Input'; a pop-up gallery window then appears. The gallery window contains 11 probability distribution function types: 1) Normal, 2) Lognormal, 3) Uniform, 4) Triangular, 5) Weibull, 6) Beta, 7) Gamma, 8) Extreme Value, 9) Exponential, 10) Pareto, and 11) Logistic. For details on each probability distribution function type, please refer to Subramanyan and Diwekar (2005b). The user selects a type of distribution and clicks 'OK', and an input parameter specification window for that distribution opens. Once a cell has been assigned an input distribution, it turns green.

Fig. 3 gives an example input parameter specification window for the GREET default distribution function of a variable, natural gas (NG) recovery efficiency. There are four elements in the input specification window:

a) Input Specification Options Frame: This frame, on the right-hand side, consists of radio buttons that are used to select the type of inputs. As seen from the figure, the normal distribution requires two input parameters, which can be specified with one of the following five input specification choices: i) mean value and standard deviation; ii) 1st and 99th percentile; iii) 5th and 95th percentile; iv) 10th and 90th percentile; and v) 20th and 80th percentile. A 'percentile' is the value below which a specified percentage of the population falls. When the inputs are given as percentiles, the code automatically estimates the mean and standard deviation. In that case, care should be taken to provide feasible percentile values (for example, the lower percentile value must be smaller than the upper one).


b) Input Parameters Boxes: These boxes are above the control buttons. Once the type of input parameter is selected, the corresponding labels automatically appear beside the input boxes. For example, in Fig. 3, the 20th and 80th percentile option has been selected, so these percentiles appear as the labels of the input text boxes. Here, the P20 value of 0.96 means that there is a probability of 20% that the actual NG recovery efficiency is equal to or below 96%.

c) Minimum and Maximum Cut-off Specification Boxes: For the normal distribution, the default minimum and maximum cut-off values are '-Infinity' and '+Infinity', respectively. These defaults are used to sample from the whole distribution. To truncate the distribution so that samples cannot fall below or above particular values, specify those values in these boxes. For example, the energy efficiency cannot be greater than 1; therefore, the maximum value of the distribution has to be specified as 1, and the plot is truncated at this value (see Fig. 3).

d) Probability Distribution Function Plot: Once the input parameters for the probability distribution have been specified, you can visualize the shape of the distribution by clicking on the button captioned 'Enter'. The plot is automatically redrawn according to the current input parameters, which is useful for seeing how the plot varies with the input parameters. The plot window also has a mean value line that marks the mean of the probability distribution function.

The new version, GREET1.7, contains more than 700 default distribution functions for key parameters, such as energy efficiencies, GHG emission factors, and criteria pollutant emission factors, for each WTW stage. To develop them, the data from each source type were read into Crystal Ball™, a statistical software package which, based on the number of data points and the scatter of the data, attempts to fit a distribution to the data for that source type. In Crystal Ball™, a mathematical fit is performed to determine the set of parameters for each standard distribution function that best describes the characteristics of the data. The quality or closeness of each fit is determined using a Chi-squared test. All distributions were also visually examined for reasonableness. Ideally, this method should be employed for every cell with a distribution function. However, limited data availability often prevented us from taking the statistical approach. In these cases, judgments were made to develop subjective distribution functions. For example, we have only limited data for the key parameter, density of conventional crude oil. In this case, we decided to use a triangular distribution function for this parameter, defined by minimum, maximum and most likely values. When new data become available, we will employ the statistical approach to improve the distribution quality of these cells. A detailed discussion of the methodology and data sources for the default distribution function database in the GREET model can be found in General Motors Corporation et al. (2001), Wang (2002), Brinkman et al. (2005), and Wu et al. (2006).

Fig. 3: An example of input specification window for the GREET input variable 'NG recovery efficiency' (window title: Normal Distribution Parameters; labeled elements: mean value line, parameter input options, input specification options)

1.2 Sampling 

Once the distribution functions for all the uncertain parameters have been specified, the next step is to specify the sampling technique to be used and the number of samples required. When the user clicks on 'Sampling' in the stochastic simulation command bar, a window appears. The user can select one of four sampling techniques: 1) Hammersley Sequence sampling (HSS) (or leaped HSS for dimension >15); 2) Monte Carlo sampling (MCS); 3) Latin Hypercube sampling (LHS); 4) Latin Hypercube Hammersley sampling (LHHS) (or leaped LHHS for dimension >15).

1.2.1 Monte Carlo sampling 

One of the most widely used techniques for sampling from a probability distribution is the Monte Carlo sampling technique, which is based on a pseudo-random number generator that approximates a uniform distribution (i.e., having equal probability in the range from 0 to 1). The specific values for each input variable are selected by inverse transformation over the cumulative probability distribution. Monte Carlo sampling also has the important property that the successive points in the sample are independent.
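The inverse transformation step can be sketched in a few lines. This is an illustrative sketch rather than the tool's implementation: the function name is hypothetical, and Python's built-in pseudo-random generator stands in for whatever generator the add-in uses.

```python
import random
from statistics import NormalDist, fmean

def monte_carlo_sample(dist, n, seed=None):
    """Draw n independent samples from dist by inverse transformation of
    pseudo-random uniform numbers."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        u = rng.random()          # uniform on [0, 1)
        if u > 0.0:               # inv_cdf is undefined at exactly 0
            samples.append(dist.inv_cdf(u))
    return samples

# Example: 2,000 samples from a standard normal distribution.
mcs = monte_carlo_sample(NormalDist(0.0, 1.0), 2000, seed=1)
mcs_mean = fmean(mcs)
```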

1.2.2 Latin Hypercube sampling 

The main advantage of the Monte Carlo method lies in the fact that the results from any Monte Carlo simulation can be treated using classical statistical methods; thus results can be presented in the form of histograms, and methods of statistical estimation and inference are applicable. Nevertheless, in most applications, the actual relationship between successive points in a sample has no physical significance; hence the randomness/independence for approximating a uniform distribution is not critical (Knuth 1973). Moreover, the error of approximating a distribution by a finite sample depends on the equidistribution properties of the sample used for U(0,1) rather than on its randomness. Once it is apparent that the uniformity properties are central to the design of sampling techniques, constrained or stratified sampling techniques become appealing (Morgan and Henrion 1990). Latin Hypercube sampling is one form of stratified sampling that can yield more precise estimates of the distribution function. In Latin Hypercube sampling, the range of each uncertain parameter Xi is subdivided into n non-overlapping intervals of equal probability. One value from each interval is selected at random with respect to the probability distribution in the interval. The n values thus obtained for X1 are paired in a random manner (i.e., equally likely combinations) with the n values of X2. These n pairs are then combined with the n values of X3 to form n triplets, and so on, until n k-tuplets are formed. In median Latin Hypercube sampling (MLHS), the value for each interval is chosen as the mid-point of the interval. MLHS is similar to the descriptive sampling described by Saliby (1990). The main drawback of this stratification scheme is that it is uniform in one dimension and does not provide uniformity properties in k dimensions.
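The stratify-then-pair procedure described above can be sketched as follows. The function name and the `median` switch are illustrative choices, not part of the GREET tool; the points are generated on the unit hypercube and would then be mapped to each parameter's distribution by inverse transformation.

```python
import random

def latin_hypercube(n, k, seed=None, median=False):
    """Return n points in the k-dimensional unit cube. Each dimension has
    exactly one point in each of the n equal-probability intervals."""
    rng = random.Random(seed)
    columns = []
    for _ in range(k):
        # One value per stratum [i/n, (i+1)/n): random, or the mid-point
        # when median=True (median Latin Hypercube sampling).
        col = [(i + (0.5 if median else rng.random())) / n for i in range(n)]
        rng.shuffle(col)          # random pairing across dimensions
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n)]

lhs_points = latin_hypercube(10, 3, seed=7)
mlhs_points = latin_hypercube(10, 2, median=True, seed=7)
```

The shuffle is what makes the pairing random, which is also why the design gives no guarantee of uniformity in more than one dimension at a time.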

1.2.3 Hammersley Sequence sampling 

In the late 1990s, an efficient sampling technique, HSS, based on Hammersley points was developed (Kalagnanam and Diwekar 1997). It uses an optimal design scheme for placing the n points on a k-dimensional hypercube. This scheme ensures that the sample set is more representative of the population, showing uniformity properties in multiple dimensions, unlike Monte Carlo, Latin Hypercube, and its variant, the Median Latin Hypercube sampling technique. Fig. 4 graphs the samples generated by different techniques on a unit square. This provides a qualitative picture of the uniformity properties of the different techniques. It is clear from Fig. 4 that the Hammersley points have better uniformity properties compared to other techniques. The main reason for this is that the Hammersley points are an optimal design for placing n points on a k-dimensional hypercube. In contrast, other stratified techniques such as the Latin Hypercube are designed for uniformity along a single dimension and then randomly paired for placement on a k-dimensional cube. Therefore, the likelihood of such schemes providing good uniformity properties on high-dimensional cubes is extremely small. One of the main advantages of Monte Carlo methods is that the number of samples required to obtain a given accuracy of estimates does not scale exponentially with the number of uncertain variables. HSS preserves this property of Monte Carlo. For correlated samples, the approach uses rank correlations to preserve the stratified design along each dimension. Although this approach preserves the uniformity properties of the stratified schemes, the optimal location of the Hammersley points is perturbed by imposing the correlation structure. Although the original HSS designs always start at the same initial point, the technique can be randomized by choosing the first prime number randomly. The HSS technique is much faster than the LHS and Monte Carlo techniques and hence is a preferred technique for uncertainty analysis as well as optimization under uncertainty.

Fig. 4: Sampling points (100) on a unit square using various sampling techniques
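For readers unfamiliar with Hammersley points, the sketch below shows the textbook construction: the first coordinate is an evenly spaced grid and the remaining coordinates are radical inverses of the point index in successive prime bases. It is a minimal sketch under stated assumptions, not the VRI implementation: the function names are hypothetical, the index starts at 1, and the first coordinate is centered at (i - 0.5)/n so that every value stays strictly inside (0, 1).

```python
def radical_inverse(i, base):
    """Reflect the base-`base` digits of the integer i about the radix point."""
    inverse, fraction = 0.0, 1.0 / base
    while i > 0:
        inverse += (i % base) * fraction
        i //= base
        fraction /= base
    return inverse

def first_primes(count):
    """Return the first `count` prime numbers by trial division."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def hammersley(n, k):
    """Return n Hammersley points in the k-dimensional unit cube."""
    bases = first_primes(k - 1)
    return [((i - 0.5) / n,) + tuple(radical_inverse(i, b) for b in bases)
            for i in range(1, n + 1)]

hss_points = hammersley(100, 3)   # uses bases 2 and 3 for dimensions 2 and 3
```

The points are deterministic, so repeated runs give the same design unless the bases or the starting index are varied.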

1.2.4 Latin Hypercube Hammersley sampling 

This sampling technique combines the k-dimensional uniformity of HSS with the one-dimensional uniformity of LHS to obtain a new sampling technique called Latin Hypercube Hammersley Sampling (Wang et al. 2004). In the process of generating samples with LHHS, the sample values of each input variable are first generated using LHS. The next step is to pair them and combine the input vectors. The conventional method is to pair all of them randomly. However, the sample correlation matrix of input variables generated by either LHS or MCS with random pairing is not exactly equal to the identity matrix I, and the samples also show poor uniformity. Hence a restricted pairing procedure is used in all cases. Even when the input variables are independent, the restricted pairing procedure is still employed with the desired correlation matrix I to make sure there is no actual dependence among the input variables. Kalagnanam and Diwekar (1997), Diwekar and Kalagnanam (1996, 1997), and Diwekar (2003b) already showed that Hammersley sequence points have better multidimensional uniformity. To give the new sampling technique this property, the HSS matrix H (N × k), which plays the role of the van der Waerden scores matrix in Iman and Conover's approach for LHS, is used in the pairing procedure.

To avoid the problem associated with H (N × k) not having a correlation matrix equal to I, the sample correlation matrix R (k × k) associated with H is used to find a matrix S such that

$S R S^{T} = C$    (1)

where C is the desired sample correlation matrix. As in the Iman and Conover approach, Cholesky factorization is used to find a lower triangular matrix Q such that

$Q Q^{T} = R$    (2)

The solution for S is then given by

$S = P Q^{-1}$    (3)

where P is the lower triangular Cholesky factor of C (i.e., $P P^{T} = C$). Correspondingly, the transformation factor for the rank matrix is changed to S and the rank matrix becomes

$H^{*} = H S^{T}$    (4)

The correlation matrix of H* is exactly equal to the desired correlation matrix C. The sample can therefore be paired according to the new rank matrix H* rather than H. In this pairing process, when a correlation structure is not specified, the variance inflation factor (VIF), defined as the largest element on the diagonal of the inverse of the correlation matrix, is computed to detect large pairing correlations. As the VIF gets much larger than 1, there may be some undesirably large pairing correlations. For a VIF larger than 10, there can be serious collinearity (Marquardt 1970, Marquardt and Snee 1975). It has been found that LHHS performs better than HSS in most cases. However, unlike MCS or HSS, the performance of LHHS is not independent of the number of variables or of the type of function used to compute the output distributions.
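As a consistency check on Eqs. (1) to (4), the short derivation below verifies that the choice of S in Eq. (3) satisfies Eq. (1). It assumes that P denotes the lower triangular Cholesky factor of the desired correlation matrix C, so that $C = P P^{T}$, and that the columns of H are standardized so that the correlation matrix of $H S^{T}$ is $S R S^{T}$.

```latex
\begin{aligned}
S R S^{T} &= \left(P Q^{-1}\right)\left(Q Q^{T}\right)\left(P Q^{-1}\right)^{T} \\
          &= P \left(Q^{-1} Q\right)\left(Q^{T} Q^{-T}\right) P^{T} \\
          &= P P^{T} = C .
\end{aligned}
```

Under these assumptions the transformed matrix $H^{*} = H S^{T}$ has correlation matrix C, which is the property the pairing procedure relies on.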

1.2.5 Leaped HSS and LHHS 

It has recently been found that the uniformity property of HSS gets distorted in higher dimensions (more than 30 uncertain variables). HSS and LHHS are generated using prime numbers as bases. To break this distortion, we introduced leaps in the prime numbers for higher dimensions. The leaped HSS and LHHS showed better uniformity than the basic HSS and LHHS. For simplicity, we have included leaped HSS and LHHS as part of the HSS and LHHS techniques in the GREET stochastic modeling capability. When the number of uncertain variables exceeds 15, the switch occurs automatically.

1.3 Forecast cells 

The next step is to select the variables whose values will be forecast. GREET1.7 includes more than 90 fuel production pathways and more than 70 vehicle/fuel systems (Wang et al. 2005). Therefore, the user can have as many as approximately 3,000 forecast variables for stochastic simulation. A special algorithm has been created to enable the user to easily select the forecast variables for the pathways of interest through four simple steps:

a) Select the vehicle technologies. 

b) Specify the transportation fuels. 

c) Specify the well-to-wheels (WTW) simulations and/or 
well-to-pump (WTP) simulations. 

d) Select the energy and emission forecast groups. 

The naming convention for the forecast variables is 'Vehicle Technology - Transportation Fuel - WTW and/or WTP - Energy and Emission Forecast'. For example, 'CIDI-DME-WTW-N2O' can be interpreted as the well-to-wheels total N2O emissions for the compression-ignition direct-injection (CIDI) vehicle fueled with dimethyl ether (DME). The forecast variables listed in the list box titled 'Selected Forecasts' are those that will be predicted at the end of the stochastic simulation.
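The four selection steps and the naming convention amount to forming every combination of the chosen items. The sketch below illustrates this; the function names and the example selections are hypothetical and are not taken from the GREET macros.

```python
from itertools import product

def forecast_name(vehicle, fuel, scope, forecast):
    """Join the four parts in the order given by the naming convention."""
    return "-".join((vehicle, fuel, scope, forecast))

def forecast_names(vehicles, fuels, scopes, forecasts):
    """List one forecast variable name per combination of the selections."""
    return [forecast_name(*combo)
            for combo in product(vehicles, fuels, scopes, forecasts)]

# Hypothetical selection: one vehicle, one fuel, both scopes, two forecasts.
names = forecast_names(["CIDI"], ["DME"], ["WTW", "WTP"], ["N2O", "CO2"])
```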

1.4 Delete distributions 

For any parametric assumption cell with a probability distribution, if the user decides to assign just a point value to that cell, the probability distribution can be deleted by selecting the cell and clicking on the 'Delete Distribution' button. The input distribution is automatically deleted and the cell color turns from green to white.

1.5 Run simulation 

After all the required inputs and forecast selections have been finalized, the 'Run Simulation' button is enabled, and the user clicks it to begin execution of the stochastic simulation. After the simulation run is completed, the forecast results are exported to another Excel file, and statistical values such as the mean, standard deviation, and 0th to 100th percentiles are calculated automatically for each forecast variable.
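The propagate-and-summarize step can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the percentile rule (linear interpolation between ordered values) is an assumption because the text does not say which rule the tool uses, and the one-line model is a toy stand-in for the GREET spreadsheet calculation.

```python
import statistics

def propagate(model, input_samples):
    """Evaluate the model once per sampled input vector."""
    return [model(*row) for row in input_samples]

def summarize(values, percentiles=(0, 5, 25, 50, 75, 95, 100)):
    """Return the mean, sample standard deviation and selected percentiles."""
    ordered = sorted(values)
    n = len(ordered)

    def percentile(p):
        # Linear interpolation between the two closest ordered values.
        position = (n - 1) * p / 100.0
        low = int(position)
        high = min(low + 1, n - 1)
        return ordered[low] + (ordered[high] - ordered[low]) * (position - low)

    return {
        "mean": statistics.fmean(ordered),
        "sd": statistics.stdev(ordered),
        "percentiles": {p: percentile(p) for p in percentiles},
    }

# Toy example: energy use equals a base demand divided by an efficiency.
inputs = [(100.0, 0.95), (100.0, 0.96), (100.0, 0.97), (100.0, 0.98)]
outputs = propagate(lambda demand, efficiency: demand / efficiency, inputs)
summary = summarize(outputs)
```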

2 Results and Discussion 

The fuel-cell vehicle (FCV) fueled with liquid hydrogen (LH2) was chosen for a case study to evaluate the performance of the four sampling techniques. The stochastic simulation was run with each sampling technique and the output results were analyzed and compared. Table 1 presents the central processing unit (CPU) times taken for 16 independent runs varying the number of samples and the sampling technique.

Table 1: CPU time (h:mm:ss) for each stochastic simulation on a 2.39 GHz processor

Sampling      Number of samples
technique     1000       2000       3000       4000
HSS           0:22:38    0:44:57    1:07:13    1:31:02
MCS           0:22:00    0:45:17    1:08:23    1:29:41
LHS           0:21:31    0:41:56    1:03:21    1:25:09
LHHS          0:21:22    0:41:47    1:02:50    1:24:13

Table 2: Mean and standard deviation (S.D.) values of forecast variable 'WTW total energy use of LH2 FCV' (1000 samples)

Sampling      Total energy use (Btu/mi)
technique     Mean       S.D.
HSS           5853.98    1029.25
MCS           5837.51    1047.53
LHHS          5857.13    1029.63
LHS           5858.70    1050.17

Fig. 5: The output distribution curves for the forecast variable 'WTW total energy use of LH2 FCV' with 1,000 samples using: a) MCS; b) HSS; c) LHHS; and d) LHS

Fig. 6: The output distribution curve for the forecast variable 'WTW total energy use of LH2 FCV' using LHHS with 1,500 samples

Figs. 5 (a) through (d) show the output distribution curves for a particular forecast variable, the WTW total energy use of the LH2 FCV, from four stochastic runs each with 1,000 samples. The four charts correspond to MCS, HSS, LHHS and LHS, respectively. The comparison of the output distribution curves substantiates the uniformity comparisons in Fig. 4. The CPU times for these four runs are almost equal, as seen from Table 1, but the HSS curve is the smoothest, corresponding to the highest level of uniformity, followed by LHHS, while MCS and LHS are almost the same with respect to smoothness. Table 2 lists the mean values and the standard deviations of all four sampling techniques for this forecast. The interesting fact here is that the means from all sampling techniques are within about 0.4% of each other, and the standard deviations of the HSS and LHHS techniques are only about 2% lower than those of the other two. Therefore, even though the mean values and variances are all close to each other, Figs. 5 (a)-(d) show that 1,000 samples are sufficient for HSS to obtain a relatively smooth curve but not for MCS, LHHS and LHS.
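The closeness of the statistics in Table 2 can be recomputed directly from the tabulated values. The sketch below is a simple arithmetic check; the variable names are illustrative.

```python
# (mean, standard deviation) in Btu/mi, copied from Table 2 (1,000 samples).
table2 = {
    "HSS": (5853.98, 1029.25),
    "MCS": (5837.51, 1047.53),
    "LHHS": (5857.13, 1029.63),
    "LHS": (5858.70, 1050.17),
}

means = [mean for mean, _ in table2.values()]
# Largest relative gap between any two simulated means, in percent.
mean_spread_pct = 100.0 * (max(means) - min(means)) / min(means)

# How much lower the HSS and LHHS standard deviations are than those of
# MCS and LHS, in percent of the larger value.
sd_gaps_pct = {
    (low, high): 100.0 * (table2[high][1] - table2[low][1]) / table2[high][1]
    for low in ("HSS", "LHHS")
    for high in ("MCS", "LHS")
}
largest_sd_gap_pct = max(sd_gaps_pct.values())
```

With these inputs the spread of the means is a few tenths of a percent and the largest standard deviation gap is close to 2%.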

Fig. 6 illustrates the output distribution curve when the number of samples for LHHS was increased to 1,500 with all other parameters remaining the same. The curve is considerably smoother than in the 1,000-sample case (see Fig. 5 (c)), though at the cost of CPU time, which increased to about 33 min. The mean value and standard deviation with 1,500 samples were computed as 5856.38 and 1028.42, respectively, differing by roughly 0.1% or less from the results with 1,000 samples.

Fig. 7 (a) illustrates the output distribution curve when the number of samples for LHS was increased to 2,000 with all other parameters remaining the same. As expected, the smoothness of the curve improved with the increase in the number of samples, though it is still not as smooth as that of HSS or LHHS. This means that LHS needs more samples to reach the smoothness level of HSS and LHHS, for example, 3,000 samples (see Fig. 7 (b)) or more. However, the CPU time increased significantly, to 42 min for 2,000 samples and 63 min for 3,000 samples (see Table 1). Therefore, the user needs to balance the number of samples against the simulation time for LHS. Again, the computed mean and standard deviation values for 2,000 and 3,000 samples show that doubling or tripling the number of samples has a negligible effect on the mean and variance.

As a test, we ran LHS and MCS simulations with 4,000 samples (the distribution curves are not shown in the paper). We found that the marginal improvement in smoothness of the output distribution curve from 3,000 to 4,000 samples is smaller than that from 1,000 to 2,000 samples or from 2,000 to 3,000 samples. However, the CPU time reaches about one and a half hours if the user selects 4,000 samples. We noticed that increasing the number of samples did bring the simulated mean value marginally closer to the actual mean value; however, this improvement is so small that it does not justify the longer simulation time.

Based on the performance comparison among the four techniques in this case study, we set HSS as the default technique, with 1,000 as the default number of samples, in GREET. If the user picks the LHHS technique for stochastic simulation, the optimal number of samples is 1,500. In the case of the LHS technique, there is a clear difference in smoothness between the 2,000 and 3,000 sample cases, but this difference costs about 21 min more CPU time for the 3,000 sample case (see Table 1). This trade-off needs to be carefully assessed by the user based on his/her preferences. Similar to LHS, an acceptable level of smoothness was reached with ~3,000 samples for the MCS technique. Therefore, HSS has achieved more than 200% reduction in running time without compromising the accuracy and quality of the prediction curves, while LHHS has achieved a 150% reduction. This is because both techniques require fewer samples to represent the sample space with substantial uniformity.
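The running-time comparison can be made concrete with the CPU times in Table 1. The sketch below is an arithmetic illustration with hypothetical names. It reports how many times longer the 3,000-sample LHS and MCS runs took than the 1,000-sample HSS run; one possible reading of the 'more than 200%' figure is the extra time of the MCS run relative to HSS, but that interpretation is an assumption rather than something the text states.

```python
import re

def to_seconds(hms):
    """Convert an 'h.mm.ss' or 'h:mm:ss' time string to seconds."""
    hours, minutes, seconds = (int(part) for part in re.split(r"[.:]", hms))
    return 3600 * hours + 60 * minutes + seconds

hss_1000 = to_seconds("0.22.38")   # HSS, 1,000 samples (Table 1)
lhs_3000 = to_seconds("1.03.21")   # LHS, 3,000 samples (Table 1)
mcs_3000 = to_seconds("1.08.23")   # MCS, 3,000 samples (Table 1)

# Ratio of running times at comparable smoothness.
lhs_ratio = lhs_3000 / hss_1000
mcs_ratio = mcs_3000 / hss_1000

# Extra time relative to HSS, and time saved by HSS, both in percent.
mcs_extra_pct = 100.0 * (mcs_ratio - 1.0)
hss_saving_vs_mcs_pct = 100.0 * (1.0 - hss_1000 / mcs_3000)
```

Expressed as a saving, HSS needs roughly one third of the MCS running time in this comparison, which is the same result viewed from the other direction.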

Fig. 7: The output distribution curves for the forecast variable 'WTW total energy use of LH2 FCV' using LHS with: a) 2,000 samples; and b) 3,000 samples

3 Conclusions

A new stochastic simulation tool has been developed and built into Argonne's GREET model to enhance its capability for addressing the uncertainties in a wide variety of input parameters. This tool has been designed as a Microsoft® Excel add-in file with Visual Basic macros, which can be loaded whenever the user needs to perform a stochastic simulation within the model. The tool automates much of the process of setting up a stochastic simulation and guides the user through each step with user-friendly graphical user interface (GUI) windows.

In general, five steps should be followed to perform a complete stochastic simulation with the new tool:

a) Cell Input: Specify probability distribution functions for the input variables;

b) Sampling: Indicate the number of samples required and the sampling technique to be used;

c) Forecast Cells: Define the forecast variables;

d) Delete Distribution: Delete a previously defined distribution;

e) Run Simulation: Propagate the uncertainties and statistically analyze the outputs.

The new version, GREET1.7, includes more than 90 fuel production pathways and more than 70 vehicle/fuel systems. As a result, the model contains more than 700 default distribution functions for various key parameters, such as energy efficiencies, GHG emission factors, and criteria pollutant emission factors, for each WTW stage, and as many as ~3,000 forecast variables for stochastic simulation. During the stochastic simulation, the selection of the sampling technique is a key factor. The new tool incorporates four sampling techniques for GREET stochastic simulations: Monte Carlo sampling, Latin Hypercube sampling, Hammersley Sequence sampling, and Latin Hypercube Hammersley sampling.

In this paper, we conducted 16 independent stochastic simulations in the GREET model to evaluate the performance of the four sampling techniques. The main findings are as follows:

a) Regardless of the sampling technique and the number of samples (between 1,000 and 4,000), the simulated mean values are close to the actual mean value (within ±1%). The standard deviation values are also close to one another (within ±5%). Increasing the number of samples brings the simulated mean value marginally closer to the actual mean value; however, the improvement is negligible.

b) With the same number of samples, the output distribution curve simulated by HSS is the smoothest, corresponding to the highest level of uniformity, followed by LHHS, while MCS and LHS are almost the same with respect to smoothness.

c) To achieve the same level of smoothness as HSS with 1,000 samples, LHHS needs to be simulated with ~1,500 samples and LHS and MCS with ~3,000 samples. Because the simulation time per sample is almost the same for these four techniques, HSS can achieve more than 200% reduction in running time without compromising the accuracy and quality of the prediction curves, while LHHS can achieve a 150% reduction compared to LHS or MCS.

d) The simulation time is positively correlated with the number of samples; therefore, the trade-off between extending simulation time and improving the smoothness of the output distribution curve needs to be carefully assessed by the user based on his/her preferences.


According to the performance comparison among these four 
techniques, we applied HSS as the default technique with 
1,000 as the default sampling number in GREET. 